CC4.5: cost-sensitive decision tree pruning
Authors
Abstract
Many methods exist for pruning decision trees, but cost-sensitive pruning has received far less attention, even though it offers additional flexibility and can improve performance. In this paper, we introduce CC4.5, a cost-sensitive decision tree pruning algorithm based on the C4.5 algorithm. CC4.5 constructs the original decision tree in the same way as C4.5, but its pruning methods differ from those of C4.5. CC4.5 includes three cost-sensitive pruning methods that deal with misclassification cost in the decision tree. Unlike many other pruning algorithms, CC4.5 uses intelligent inexact classification to consider both error and cost when pruning. Moreover, experiments show that CC4.5 produces decision trees that are improved with respect to cost, while their comprehensibility and accuracy remain satisfactory.
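The abstract does not spell out the three pruning methods, so the following is only a minimal sketch of the general idea of cost-sensitive pruning, not CC4.5 itself. It assumes a hypothetical `Node` type holding class counts, a hypothetical cost matrix keyed by `(true_class, predicted_class)`, and a simple rule: collapse a subtree into a leaf whenever the leaf's total misclassification cost is no higher than that of the subtree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical cost matrix: cost[(true_class, predicted_class)] -> cost.
CostMatrix = Dict[Tuple[str, str], float]


@dataclass
class Node:
    # Class counts of the (pruning-set) examples that reach this node.
    counts: Dict[str, int]
    children: List["Node"] = field(default_factory=list)


def leaf_cost(counts: Dict[str, int], cost: CostMatrix) -> Tuple[str, float]:
    """Return the label with the lowest total misclassification cost, and that cost."""
    best_label, best_cost = None, float("inf")
    for pred in sorted(counts):
        total = sum(n * cost[(true, pred)]
                    for true, n in counts.items() if true != pred)
        if total < best_cost:
            best_label, best_cost = pred, total
    return best_label, best_cost


def prune(node: Node, cost: CostMatrix) -> float:
    """Prune bottom-up; return the cost of the (possibly pruned) subtree."""
    _, as_leaf = leaf_cost(node.counts, cost)
    if not node.children:
        return as_leaf
    as_subtree = sum(prune(child, cost) for child in node.children)
    if as_leaf <= as_subtree:
        node.children = []  # replace the subtree with a single leaf
        return as_leaf
    return as_subtree


# Illustrative values only: a false negative costs five times a false positive.
cost = {("pos", "neg"): 5.0, ("neg", "pos"): 1.0}
root = Node({"pos": 4, "neg": 10},
            [Node({"pos": 3, "neg": 2}), Node({"pos": 1, "neg": 8})])
print(prune(root, cost), len(root.children))
```

With these illustrative costs the split is kept, because labelling each child separately is cheaper than any single label at the root. An error-based criterion could reach a different decision on the same counts, which is the flexibility the abstract refers to.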
Similar resources
Use of Expert Knowledge for Decision Tree Pruning
Decision tree technology has been proven to be a valuable way of capturing human decision making within a computer. One main problem for many traditional decision tree pruning methods is that it is always assumed that all misclassifications are equally probable and equally serious. However, in a real-world classification problem, there may be a cost associated with misclassifying examples from ...
Decision Tree Pruning Using Expert Knowledge
Decision tree technology has proven to be a valuable way of capturing human decision making within a computer. It has long been a popular artificial intelligence (AI) technique. During the 1980s, it was one of the primary ways of creating an AI system. During the early part of the 1990s, it somewhat fell out of favor, as did the entire AI field in general. However, during the later 1990s, with ...
Cost-Sensitive Decision Trees with Pre-pruning
This paper explores two simple and efficient pre-pruning strategies for the cost-sensitive decision tree algorithm to avoid overfitting. One is to limit the cost-sensitive decision trees to a depth of two. The other is to prune the trees with a pre-specified threshold. Empirical study shows that, compared to the error-based tree algorithm C4.5 and several other cost-sensitive tree algorithms, t...
Cost-sensitive C4.5 with post-pruning and competition
Decision trees are an effective classification approach in data mining and machine learning. In applications, test costs and misclassification costs should be considered while inducing decision trees. Recently, some cost-sensitive learning algorithms based on ID3, such as CS-ID3, IDX, and λ-ID3, have been proposed to deal with this issue. These algorithms deal with only symbolic data. In this paper, we ...
Cost-sensitive Decision Trees with Post-pruning and Competition for Numeric Data
Decision trees are an effective classification approach in data mining and machine learning. In some applications, test costs and misclassification costs should be considered while inducing decision trees. Recently, some cost-sensitive learning algorithms based on ID3, such as CS-ID3, IDX, ICET and λ-ID3, have been proposed to deal with this issue. In this paper, we develop a decision tree algorit...